Abstract

In recent years, Imaging Atmospheric Cherenkov Telescopes (IACTs) have discovered a rich diversity of very high energy (VHE,
> 100 GeV) γ-ray emitters in the sky. These instruments image Cherenkov light emitted by γ-ray induced particle cascades in the
atmosphere. Background from the much more numerous cosmic-ray cascades is efficiently reduced by considering the shape of the
shower images, and the capability to reduce this background is one of the key aspects that determine the sensitivity of an IACT. In
this work we apply a tree classification method to data from the High Energy Stereoscopic System (H.E.S.S.). We show the stability
of the method and its capabilities to yield an improved background reduction compared to the H.E.S.S. Standard Analysis.

1. Introduction

In the last years ground-based Imaging Atmospheric
Cherenkov Telescopes (IACTs) opened a previously inaccessible
window for the study of astrophysical sources of γ radiation
in the VHE regime. The detection of more than 50 galactic
VHE γ-ray emitters during the galactic plane scan performed
by the H.E.S.S. collaboration between 2004 and 2007 [1]
increased the number of known VHE γ-ray sources tenfold and hence
established a new field in astronomy.

The earth's atmosphere is opaque to VHE photons, which initiate
electromagnetic particle cascades (Extensive Air Showers,
EAS) in the atmosphere. The highly relativistic charged particles
in the cascade emit Cherenkov light which can be imaged
via a large mirror onto a fine-grained camera. From the shower
image one can reconstruct the arrival direction of the primary
γ-ray and calculate its energy using the number of collected
Cherenkov photons and the directional information.

One of the big advantages of IACTs is their enormous effective
detector area. Modern instruments reach [...],
which is five orders of magnitude larger than what is typically
achieved with satellite-based instruments like EGRET or Fermi
LAT. While the latter benefit from quasi background-free observations,
Cherenkov telescopes have to deal with a vast number
of hadronic cosmic-ray background events. The capability to
suppress these against the γ-rays associated with astrophysical
sources is one of the key aspects that determines the sensitivity
of IACTs.

To increase the sensitivity of ground-based VHE γ-ray telescope
systems beyond what is obtained with state-of-the-art
instruments like H.E.S.S. [?], MAGIC [?], VERITAS [?] or
CANGAROO-III [?], larger arrays are needed, as e.g. studied
by the CTA [?] and AGIS [?] consortia.

Still, for the existing instruments, increased background reduction
can improve the sensitivity considerably. The classical
Hillas approach [10], which is robust but less sensitive,
parametrises the 2-dimensional elliptical shape of
the recorded images for reconstruction and selection of γ-ray-like
events. Compared to it, sensitivity can be increased by, e.g., analysis
methods which compare the detected images with a 3-dimensional
photosphere model of the EAS (e.g. the 3D Model analysis
introduced by Lemoine-Goumard et al. [11]). Furthermore, the
applicability of multivariate analysis techniques like Random
Forests [12] in ground-based VHE astronomy has recently been
demonstrated [13, 14, 15].



In this paper we follow the latter approach and discuss the
application of the Boosted Decision Trees (BDT) method, provided
by the TMVA package [16], to data obtained by the
H.E.S.S. experiment. The stability of the technique and its
capabilities to improve γ/hadron separation compared to the
H.E.S.S. Standard Analysis are demonstrated. After a brief
description of the method (Chapter 2) and an introduction of
the training and evaluation of the BDT method with events
recorded by the H.E.S.S. experiment in Chapter 3, the application
of BDT to H.E.S.S. data is discussed (Chapter 4). Finally,
performance and sensitivity evaluated using Monte-Carlo simulations
and background data are presented (Chapter 5).

2. Classification using Boosted Decision Trees 

Machine learning algorithms like Neural Networks (NNs),
Likelihood Estimators or Fisher discriminants¹ are basically extensions
of simple cut-based analysis techniques to multivariate
algorithms. They are widely used in natural sciences for the
classification of events of different types which are described by
a set of input parameters. Beyond the aforementioned techniques,
the MiniBooNE [17, 18] and D0 [19] collaborations
recently utilised the BDT method for particle identification in
high energy physics, and Bailey et al. [20] used it for supernova
searches in optical astronomy.

¹ These techniques combine several shower parameters into one number
which gives the likeness of an event with a γ-ray or a cosmic ray.

Figure 1: Sketch of a decision tree. An event, described by a parameter set,
$M_i = (m_{i,1}, \ldots, m_{i,6})$, undergoes at each node a binary split criterion (passed or
failed) on one of its parameters until it ends up in a leaf. This leaf marks it as
signal (S) or background (B).

One of the main advantages of NNs and BDT compared to 
Likelihood classifiers or Fisher discriminants is the consider- 
ation of nonlinear correlations between input parameters. Fur- 
thermore, the BDT method effectively ignores parameters with- 
out separation power whereas NNs could suffer from those, re- 
sulting in a degraded separation. 



2.1. Basics of the Decision Tree Algorithm 



Decision trees [21, 22] can be represented by a two-dimensional
structure like the one sketched in Fig. 1. By applying,
at each branching, a binary split criterion (passed or failed) on
one of the characterising input parameters, they classify events
of unknown type as signal-like or background-like. The determination
of these criteria is also referred to as training of a
decision tree, and is performed with a training set consisting
of events of known type. To circumvent a drawback of single
decision trees, namely the instability against statistical fluctuations
in the training event set, one extends the single decision
tree to a forest of decision trees, which differ in the binary split
criteria. A weighted mean vote of the classification of all single
trees in the forest stabilises the response of the classifier and
improves its performance. This vote is the output of the BDT
and describes the signal- or background-likeliness of an event.
In this work it is referred to as the ζ variable. The forest of
trees is obtained by a process called "boosting", starting from
an initial single tree.
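
The weighted mean vote can be illustrated with a minimal Python sketch. It assumes that each tree returns +1 for signal and -1 for background and that one weight per tree is supplied by the boosting step. The function name `forest_response` and the ±1 convention are illustrative assumptions, not the TMVA implementation.

```python
def forest_response(tree_votes, tree_weights):
    """Weighted mean of single-tree votes (+1 signal, -1 background)."""
    if len(tree_votes) != len(tree_weights):
        raise ValueError("one weight per tree is required")
    total = sum(tree_weights)
    return sum(v * w for v, w in zip(tree_votes, tree_weights)) / total


# Three hypothetical trees: two vote signal, the heaviest one votes background.
response = forest_response([+1, +1, -1], [1.0, 1.0, 2.0])
```

In this toy case the two light signal votes are exactly balanced by the single heavy background vote, so the response sits midway between the two classes.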

2.2. The training procedure for a single tree 

The training or building of a decision tree is the process that
finds the appropriate splitting criterion for each node using a
training sample, $S$, of events of known type. The training sample
is composed of a signal training sample, $S_1$, and a background
training sample, $S_2$, which consist of $N_1$ signal and $N_2$
background events, respectively. Each event in the training set
is characterised by a weighting factor, $\omega_i$, and a set of input
parameters, $M_i$. To build a decision tree from such a training sample
the following steps are performed:

• The training samples are normalised in such a way that all
signal events have the same weight, $\omega_i(S_1) = 1/N_1$, and all
background events have the same weight, $\omega_i(S_2) = 1/N_2$.

• Tree building starts at the root node (top node in Fig. 1),
where one finds the variable and split value that provides
the best separation of signal and background events. According
to this splitting criterion, $S$ is divided into two subsets
of events that either pass or fail this criterion. Each
subset is fed into a child node where again the cut parameter
which best separates signal and background events is
determined.

• This procedure is applied recursively until further splitting
would not increase the separation, or a preassigned minimum
number of events is reached². According to the
majority of signal and background events, the last-grown
nodes (which are called leaves) are assigned signal (S) or
background (B) type, respectively (see Fig. 1).
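
The search for the best split at a single node can be sketched as follows. This is a simplified illustration in Python, not the TMVA code: it assumes the Gini index p(1-p) as separation measure, a separation gain weighted by the sum of event weights in each node, and an evenly spaced scan of cut values. The names `gini` and `best_split` and the toy events are hypothetical.

```python
def gini(sig_w, bkg_w):
    """Gini index p(1-p) of a node, with p the weighted signal purity."""
    total = sig_w + bkg_w
    if total == 0:
        return 0.0
    p = sig_w / total
    return p * (1.0 - p)


def best_split(events, n_steps=100):
    """events: list of (parameters, is_signal, weight).

    Returns (gain, parameter index, cut value) of the best binary split.
    """
    sig = sum(w for _, s, w in events if s)
    bkg = sum(w for _, s, w in events if not s)
    parent = (sig + bkg) * gini(sig, bkg)
    best = (0.0, None, None)
    for k in range(len(events[0][0])):
        values = [p[k] for p, _, _ in events]
        lo, hi = min(values), max(values)
        for step in range(1, n_steps):
            cut = lo + (hi - lo) * step / n_steps
            # Weighted signal and background content of the "passed" subset.
            ps = sum(w for p, s, w in events if s and p[k] < cut)
            pb = sum(w for p, s, w in events if not s and p[k] < cut)
            fs, fb = sig - ps, bkg - pb
            gain = parent - (ps + pb) * gini(ps, pb) - (fs + fb) * gini(fs, fb)
            if gain > best[0]:
                best = (gain, k, cut)
    return best


# Toy training set with weights 1/N_1 and 1/N_2 (N_1 = N_2 = 2).
# Parameter 0 separates the two classes, parameter 1 carries no information.
events = [
    ((0.0, 0.5), True, 0.5),
    ((0.1, 0.5), True, 0.5),
    ((0.9, 0.5), False, 0.5),
    ((1.0, 0.5), False, 0.5),
]
gain, index, cut = best_split(events)
```

A full tree would apply `best_split` recursively to the passed and failed subsets until one of the stopping conditions listed above is met.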

2.3. Boosting 

Single decision trees are sensitive to statistical fluctuations
in the training sample, hence a boosting procedure is applied
which results in a forest of trees and thus increases the stability
of the method. In this procedure, events that were misclassified
in the building of the previous tree are multiplied by a boost
weight, α, thereby getting a higher weight in the training of the
next tree. Hence, the boosting is applied to all trees except for
the first one. This method is known as AdaBoost or adaptive
boost [23]. α is calculated from the fraction of misclassified
events in all leaves, err:

$$\alpha = \frac{1 - \mathrm{err}}{\mathrm{err}} \qquad (1)$$

After having applied α to each misclassified event, renormalisation
of the training samples keeps the sum of weights of all
events in a decision tree constant.
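
One boosting step can be written down in a few lines of Python. The sketch below assumes that err is the weighted fraction of misclassified events and that α has the AdaBoost form (1 - err)/err; the function name `boost_weights` is hypothetical.

```python
def boost_weights(weights, misclassified):
    """Apply one AdaBoost step and return (alpha, renormalised weights)."""
    total = sum(weights)
    err = sum(w for w, m in zip(weights, misclassified) if m) / total
    if err <= 0.0:
        raise ValueError("no misclassified events, boost weight undefined")
    alpha = (1.0 - err) / err
    # Only misclassified events are multiplied by the boost weight.
    boosted = [w * alpha if m else w for w, m in zip(weights, misclassified)]
    # Renormalise so that the sum of weights stays constant.
    norm = total / sum(boosted)
    return alpha, [w * norm for w in boosted]


# Four events of equal weight, the first one was misclassified.
alpha, new_weights = boost_weights([0.25] * 4, [True, False, False, False])
```

With err = 0.25 the boost weight is 3, and after renormalisation the misclassified event carries half of the total weight in the training of the next tree.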

2.4. BDT settings 

In this work we use the BDT method provided by the TMVA 
package. The decision tree settings are mostly default values, 
which have been optimised and tested by the TMVA develop- 
ers. These parameters guarantee a fast training procedure and a 
stable response of the classifier and are marked with a * in the 
following. 

• The number of trees was chosen to be 200*, which is a 
compromise between separation performance and process- 
ing power. Varying this value in a broad range does not 
significantly change the presented results. 



² This avoids overtraining due to statistically insignificant leaves.

• The Gini Index* was used as separation type. It calculates
the inequality between signal and background distributions
for each value to find the best cut. Other separation
types were tested and found to achieve similar results.

• Splitting was stopped when the number of events in a node
fell below $(N_1 + N_2)/(10 \cdot N_\mathrm{par}^2)$*, taking into account the
training statistics and the number of training parameters.
Typical numbers are between 100 and 1000 for the smallest
and largest data set, respectively.

• The number of steps used to scan the parameter space for 
the best splitting criterion was increased from 20 to 100 to 
adequately cover training parameters with a large range of 
values. 
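
As a worked example of the stopping criterion, the sketch below evaluates the minimum node size for a hypothetical training set. It assumes that the threshold is the total number of training events divided by ten times the squared number of training parameters; the function name and the event numbers are illustrative only.

```python
def min_events_per_node(n_signal, n_background, n_parameters):
    """Node size below which splitting stops (assumed form of the criterion)."""
    return (n_signal + n_background) / (10 * n_parameters ** 2)


# Hypothetical band with 18000 signal and 18000 background events, 6 parameters.
threshold = min_events_per_node(18000, 18000, 6)
```

For six training parameters the denominator is 360, so a total of 36000 training events gives a threshold of 100 events per node.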



3. Training and Evaluation of the BDT method

Having discussed the basic functioning of the BDT and details of
the growing procedure in the last chapter, this section deals with
the training and evaluation of the BDT method. After an introduction
of the training parameters used in this work, the properties
of the signal and background training samples are discussed.
Finally, tests of the classifier's response are presented.

3.1. Training parameters 

The recorded EAS images contain pixels which mainly store
photons from the night sky background (NSB). They are removed
in an image cleaning procedure [24] for the further image
analysis. Only pixels with an intensity of more than 5 p.e. and a
neighbouring pixel with more than 10 p.e. (and vice versa) are kept,
thereby just selecting pixels which contain Cherenkov photons
originating from the EAS.
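
The two-level cleaning can be sketched in Python as below. The sketch assumes strict thresholds (more than 5 p.e. and more than 10 p.e.) and a camera geometry given as a list of neighbour indices per pixel; the function name `clean_image` and the toy one-dimensional "camera" are hypothetical.

```python
def clean_image(intensity, neighbours, low=5.0, high=10.0):
    """Return the indices of pixels that survive the two-level cleaning."""
    kept = []
    for pix, amp in enumerate(intensity):
        for nb in neighbours[pix]:
            nb_amp = intensity[nb]
            # Keep a pixel above the low level next to one above the high
            # level, and vice versa.
            if (amp > low and nb_amp > high) or (amp > high and nb_amp > low):
                kept.append(pix)
                break
    return kept


# Toy camera: five pixels in a row, intensities in p.e.
intensity = [12.0, 6.0, 3.0, 11.0, 1.0]
neighbours = [[1], [0, 2], [1, 3], [2, 4], [3]]
kept = clean_image(intensity, neighbours)
```

Pixel 3 is bright but isolated and is therefore removed, which is the intended behaviour for NSB fluctuations.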

To classify the recorded air-shower events as of either signal
or background type, a set of training parameters has been derived
using information from the EAS images. The training
parameters are based on the Hillas Parameters [10] which are
calculated using the second moments of the cleaned shower
images. Of these, the width, length, and total intensity (also
called image size) of the ellipse are used for classification.
Compared to cosmic-ray induced showers, which in general
exhibit a rather irregular shape, showers produced by γ-rays
(or electrons) have an elliptical, quite regular structure. The
Hillas Parameters inherently store information about the shape
of the shower, and can therefore be used to discriminate between
cosmic-ray and γ-ray primaries. Furthermore, an event
recorded by multiple telescopes is better constrained. To be
independent of the number of participating telescopes (hereafter
called multiplicity), the Hillas Parameters of individual telescopes
are averaged. The same is true for all the BDT training
parameters, presented in the following:

• One type of training parameters is based on the mean
reduced scaled width approach introduced by Aharonian
et al. [24]. For an image with a given size and reconstructed
impact distance³, the mean expected width for a
γ-ray, $\langle W_i \rangle$, as obtained from γ-ray simulations is compared
to the measured width $W_i$. The Scaled Width for telescope
$i$ is then defined as $\mathrm{SCW}_i = (W_i - \langle W_i \rangle)/\sigma_i$, with $\sigma_i$ being
the spread of the expected width. The mean reduced
scaled width (MRSW) can then be calculated as the average
SCW over all telescopes:

$$\mathrm{MRSW} = \frac{1}{\sum_i \omega_i} \sum_i (\mathrm{SCW}_i \cdot \omega_i) \qquad (2)$$

taking into account the accuracy of the γ-ray simulations
by introducing a weighting factor $\omega_i$, defined as $\omega_i =$ [...].

Similarly, the mean reduced scaled length (MRSL) is calculated.
By comparing the measured width and length of
the image with the prediction for a hadronic event⁴, two
additional training parameters, the mean reduced scaled
width off (MRSWO) and mean reduced scaled length off
(MRSLO), are computed.

• Another parameter addresses the different interaction
lengths of photons and hadronic cosmic rays in the atmosphere.
It is expressed as the depth of the shower maximum,
$X_\mathrm{max}$, and reconstructed from the recorded shower
images. This parameter, too, is calculated as a weighted
mean value over all participating telescopes.

• Because of their irregular structure, the energy of hadron-induced
showers may be reconstructed differently for telescopes
seeing the shower from different directions. The
ΔE/E parameter, calculated as the averaged spread in energy
reconstruction between the triggered telescopes, adds
additional separation power to the BDT classification.

For illustration, Fig. 2 shows all training parameter distributions
for events with zenith angles around 20° and energies
0.5 TeV < E < 1.0 TeV.
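
The computation of a mean reduced scaled parameter can be sketched as follows. The Python function below takes the per-telescope weights as an input, since their definition is not reproduced here; the function name and all numbers are hypothetical and only illustrate the weighted average of the scaled values.

```python
def mean_reduced_scaled(measured, expected, spread, weights):
    """Weighted mean of (measured - expected) / spread over all telescopes."""
    scaled = [(m - e) / s for m, e, s in zip(measured, expected, spread)]
    return sum(sc * w for sc, w in zip(scaled, weights)) / sum(weights)


# Two telescopes: widths in degrees, weights chosen arbitrarily.
mrsw = mean_reduced_scaled(
    measured=[0.12, 0.10],
    expected=[0.10, 0.10],
    spread=[0.02, 0.02],
    weights=[1.0, 3.0],
)
```

The first telescope sees an image one spread wider than expected, the second one matches the expectation, so the weighted mean is 0.25. The same function applies to MRSL, MRSWO and MRSLO when the corresponding expectations are supplied.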

3.2. Training sample 

The training set used for building the BDT consists of Monte-Carlo
simulations of γ-rays as signal events, and Off-Events as
cosmic-ray background. The γ-rays are simulated as resulting
from a point source, at a fixed distance (offset) of 0.5° from the
camera centre, and follow an energy distribution $dN/dE \propto E^{-\Gamma}$
with index Γ = 2.0. Since cosmic rays reach the earth isotropically,
the Off-Events are homogeneously distributed over the
field of view of the camera. A cut on the minimum image size
of 80 p.e. and the maximum distance between the centre-of-gravity
(COG) of the Hillas ellipse and the camera centre (to
reduce effects of image truncation) was applied to the training
sample. This is also referred to as pre-selection and used to
exclude poorly reconstructed events from the training process.



³ The distance between the telescope and the impact point of the extrapolated
primary particle track on the ground.

⁴ These hadronic events are obtained in H.E.S.S. observations of sky regions
without significant γ-ray contamination as cosmic-ray background (also
referred to as Off-Events).







Figure 2: Distribution of the training variables with reconstructed energies between (0.5-1.0) TeV in the zenith angle range (15-25)° for γ-rays (black) and
cosmic rays (red).



3.3. Training

The aim of a BDT classification is a stable γ/hadron separation
over the whole dynamical range of the telescope system,
which comprises the accessible energy range as well as
the observational conditions (e.g. the zenith angle of the observation).
Since the shower shape changes with the primary
particle energy and its zenith angle, the distributions of some
of the input parameters and consequently the response of the
classifier change. As opposed to the mean reduced scaled parameters,
which by construction are independent of event energy
and zenith angle, the depth of the shower maximum and the
uncertainty in the energy estimation do depend on both these
quantities.

This characteristic requires a training of the BDT in energy
and zenith angle bands. The energy range accessible for
H.E.S.S. (from ~100 GeV to ~100 TeV) was divided into six
bands, based on the energy reconstructed assuming a γ-ray hypothesis,
such that for each of seven zenith angle bands (from
0° to 60°) the input parameter distributions do not change significantly,
and a sufficient number of events for the training process
was available. A summary of the training statistics in the
energy and zenith angle bands can be found in Table 1. The
decreasing number of training events with increasing energy
and/or zenith angle is a direct consequence of the energy spectra
of the training sample and the increased energy threshold of
the H.E.S.S. system at larger zenith angles.
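
Selecting the classifier trained for a given event then amounts to a lookup in two lists of band edges, as in the Python sketch below. The zenith edges above 25° follow the bands listed in Table 3, while the two lowest zenith bands and all energy edges except the (0.5-1.0) TeV band are assumptions made for this illustration; `find_band` is a hypothetical helper.

```python
from bisect import bisect_right

# Seven zenith angle bands in degrees (the two lowest bands are assumed).
ZENITH_EDGES = [0.0, 15.0, 25.0, 35.0, 42.5, 47.5, 52.5, 60.0]
# Six energy bands in TeV (hypothetical edges apart from 0.5 and 1.0).
ENERGY_EDGES = [0.1, 0.3, 0.5, 1.0, 2.0, 5.0, 100.0]


def find_band(value, edges):
    """Index of the band containing value, or None if outside all bands."""
    if not edges[0] <= value < edges[-1]:
        return None
    return bisect_right(edges, value) - 1


# An event at 20 deg zenith angle with a reconstructed energy of 0.7 TeV.
zenith_band = find_band(20.0, ZENITH_EDGES)
energy_band = find_band(0.7, ENERGY_EDGES)
```

The pair of indices identifies one of the 7 x 6 trained classifiers; events outside the trained range return None and have to be handled separately.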

As visible from Fig. 2, all parameters show a more or less
pronounced separation power which manifests itself in a different
importance of these variables for the building of the BDT.
This importance is calculated using the rate of occurrence of a
splitting variable during the training procedure, weighted by the
squared separation gain and the number of events in the corresponding
nodes [21]. Fig. 3 demonstrates that the relative importance
of the training parameters does depend on the energy
and zenith angle of the event and that this importance changes
from band to band.

While the MRSW parameter is generally the most important
classification variable, this is not true for events with energies
below a few hundred GeV, since in this energy range hadron-
and γ-initiated showers look similar [25, 26]. Here, the $X_\mathrm{max}$
parameter provides better separation, because it carries information
about the primary particle interaction length without
taking into account the shape of the shower image. Therefore,
$X_\mathrm{max}$ is uncorrelated with the image shape parameters and an
important parameter for the γ/hadron separation at low energies
and large zenith angles.

On the other hand, the spread in event energy reconstruction,
ΔE/E, becomes more important for events of high energies,
because in this energy range γ-initiated showers exhibit
a rather regular shape, whereas hadron-initiated showers show
large fluctuations and therefore a large spread in the energy
reconstructed by the participating telescopes.

Also the MRSWO and MRSLO parameters carry additional 
information about the shower shape. They suffer from the larger 
hadronic shower fluctuations, but nevertheless contribute to a 
significant extent to the training procedure. 

Table 1: Number of signal (first value) and background training events (second value) in all trained zenith angle and energy bands. Events with small energy and
large zenith angle cannot be reconstructed since the energy threshold of the H.E.S.S. array increases with zenith angle.

Figure 4: BDT output for events using an independent test sample (same energy
and zenith range as in Fig. 2).



Beyond the parameters used in this work, additional variables
which parametrise the intrinsic image properties (e.g. those
obtained for the 3D Model analysis [11]) are sensitive to different
shower properties and could further improve the BDT classification.



3.4. BDT response 

After having grown the BDT, the classifier's response was
tested in all zenith angle and energy bands with an independent
test sample of signal and background events. As an example,
Fig. 4 shows the result of the classification of this test sample
with the BDT trained in the (0.5-1.0) TeV band with zenith angles
(15-25)°, demonstrating the excellent classification power
of the BDT approach in terms of γ/hadron separation. However,
as explained in the last section, some of the input parameters
depend on zenith angle and energy and therefore the ζ distributions
look different from band to band. This later requires
zenith- and energy-dependent cuts, to make the γ/hadron separation
independent of the input parameter distributions (Chapter 4).


4. Systematic studies using H.E.S.S. data 

The consistency between data and simulations is one of the
key aspects for the analysis of VHE γ-ray sources. Since observations
cover a broad energy range and are performed under
various observational conditions (e.g. different zenith angles
or telescope configurations), the BDT classification has to
be tested under these conditions. For this purpose, we apply
the BDT method to H.E.S.S. observations of the Galactic Centre
(GC) region performed in 2004 and compare the excess of
γ-rays above the background with the predictions from γ-ray
simulations with similar properties.

4.1. Comparison between simulations and data 



The data set used here is a subset of the GC observations [27]
and accumulates to a total livetime⁵ of 11.4 hours. The data
were selected by zenith angle to cover a smaller range of 15° <
θ < 25°, thereby avoiding the mixing of γ-ray simulations at
different zenith angles when comparing the results. The mean
offset of the observations is 1°. In the following we compare
the γ-ray excess of the GC source HESS J1745-290 to γ-ray
simulations at a fixed zenith and offset angle of 20° and 1°,
respectively. The energy spectrum of HESS J1745-290 follows
a power-law in energy with a spectral index of Γ = 2.2 between
(0.2-10.0) TeV [28], and the γ-ray simulations are chosen to
match the same spectral shape in this energy range.

The ζ distributions for events coming from the assumed
source region (On-Region) and from seven background control
regions (Off-Regions)⁶ are shown in Fig. 5 (a). The γ-ray
excess can then be calculated as $N_\gamma = N_\mathrm{on} - \alpha \cdot N_\mathrm{off}$, with $N_\mathrm{on}$
and $N_\mathrm{off}$ being the number of events from the On-Region and
Off-Regions, respectively, and α the normalisation factor which
accounts for the different geometrical areas of the On-Region
and Off-Regions. The comparison between γ-ray excess and
simulated γ-rays (Fig. 5 (b)) reveals an excellent agreement
and demonstrates that the BDT classifies both types of events in
the same way over a broader zenith angle and energy range.
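
The bin-wise background subtraction behind such a comparison can be sketched in Python as below. The sketch assumes binned ζ distributions for the On-Region and the summed Off-Regions and, for seven Off-Regions of the same size as the On-Region, a normalisation of α = 1/7; the counts and the function name `excess_histogram` are hypothetical.

```python
def excess_histogram(on_counts, off_counts, alpha):
    """Excess per bin: N_on - alpha * N_off."""
    return [on - alpha * off for on, off in zip(on_counts, off_counts)]


# Two hypothetical bins of the distribution, seven equally sized Off-Regions.
alpha = 1.0 / 7.0
excess = excess_histogram([10, 30], [14, 70], alpha)
```

The resulting excess histogram is what is compared, after normalisation, with the distribution of simulated γ-rays.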



⁵ The livetime is the observation time corrected for the dead-time of the system.

⁶ The background estimation method used is known as the reflected background
model [29].

To illustrate the stability of the BDT classification with respect
to different subsets of events, Fig. 6 shows the comparison
for events with low energies of 0.2 TeV < E < 0.4 TeV and for
events which were recorded by just two telescopes. These two
subsets contain 1/2 and 1/3 of all events, respectively, and are
difficult to classify, given the limited separation information
for this kind of event. Even for those, the agreement between
γ-ray simulations and γ-ray excess is obvious and confirms the
robustness of the BDT classification.

4.2. Spectral analysis with the BDT method 

The last section illustrated that the BDT classification of data
and simulations leads to consistent results under variation of
different parameters like the covered energy range or the telescope
multiplicity of the events. Hence, the BDT classification
can be used to select γ-ray-like events for the spectral analysis
of VHE γ-ray sources.

As mentioned before, the energy- and zenith-dependence of
some of the input parameters leads to a zenith- and energy-dependent
BDT classification. A fixed cut on ζ would accordingly
lead to different cut efficiencies and hence result in a classification
which depends on the observational conditions⁷. To circumvent
this problem, the independent test sample was used
to predict the γ-efficiency of all possible ζ cuts in each zenith
angle and energy band. This information was then used to assign
a corresponding γ-efficiency to every ζ of an event, $\epsilon_\gamma(\zeta)$.


In the H.E.S.S. Standard Analysis [24], γ-ray selection cuts
are optimised on MRSW, MRSL, image intensity and $\theta^2$⁸
simultaneously to maximise the significance (σ, defined in [30],
Equation (17)). The same optimisation procedure was applied
to our analysis, but using $\epsilon_\gamma(\zeta)$ instead of MRSW and MRSL.
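
For orientation, the sketch below evaluates a detection significance from On and Off counts. It assumes that the cited Equation (17) is the widely used likelihood-ratio significance of Li & Ma (1983); the function name, the sign convention for negative excesses and the numbers are illustrative choices.

```python
import math


def li_ma_significance(n_on, n_off, alpha):
    """Likelihood-ratio significance for n_on, n_off counts and ratio alpha."""
    if n_on <= 0 or n_off <= 0:
        raise ValueError("counts must be positive")
    total = n_on + n_off
    term_on = n_on * math.log((1.0 + alpha) / alpha * n_on / total)
    term_off = n_off * math.log((1.0 + alpha) * n_off / total)
    # Guard against tiny negative values caused by rounding.
    value = math.sqrt(2.0 * max(0.0, term_on + term_off))
    return value if n_on >= alpha * n_off else -value


# 150 On events and 700 events in seven Off-Regions of equal size.
sigma = li_ma_significance(150, 700, 1.0 / 7.0)
```

In this example the excess of 50 events corresponds to about 4.3σ. An optimisation of selection cuts would vary the cut values and keep the set that maximises this number for the assumed source.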



⁷ On the other hand, cuts on MRSW and MRSL as applied in the H.E.S.S.
Standard Analysis neither depend on the event energy nor on the zenith angle
and hence preserve the cut efficiency.

⁸ The squared angular distance between the assumed source position and the
reconstructed shower direction.



(a)
Configuration   ε_γ(ζ) Max   θ² Max (deg²)   Size Min (p.e.)
Standard        0.84         0.0125          60
Hard            0.83         0.01            160

(b)
Configuration   MRSW Max (σ)   MRSL Max (σ)   θ² Max (deg²)   Size Min (p.e.)
Standard        0.9            2.0            0.0125          80
Hard            0.7            2.0            0.01            200

Table 2: (a): Selection cuts optimised for Configuration Standard (strong, steep
spectrum sources) and Hard (weak, hard spectrum sources) for the ζ analysis.
(b): Selection cuts optimised for the Standard and Hard Configurations for the
H.E.S.S. Standard Analysis [24]. Minimum cuts on MRSW and MRSL of -2.0
are applied in the case of the Standard Analysis.



Here we optimised for two different sets of assumed strength
and spectral index of the source, namely the ζ std-cuts (10%
of the integrated Crab flux above 200 GeV with a spectral index
of Γ = 2.6) and the ζ hard-cuts (1% of the integrated Crab
flux above 200 GeV with a spectral index of Γ = 2.0). Together
with the cuts used in the H.E.S.S. Standard Analysis, which are
optimised for the same source types, they are summarised and
described in Table 2.

The optimised ζ std-cuts were applied to the HESS J1745-290
data set and a spectrum was extracted. The spectrum obtained
with the ζ std-cuts and the published differential
flux [28] are shown in Fig. 7. The excellent agreement
between both results further consolidates the applicability of
the BDT approach for the analysis of VHE γ-ray sources. Additional
spectral tests with sources of different spectral shape,
flux level or source extension were performed. They show the
same agreement between the ζ std-cuts and the H.E.S.S. Standard
Analysis.

Figure 5: (a): ζ distribution for events from the On-Region (red) and events from the Off-Regions (black), weighted by α, from HESS J1745-290 observations.
(b): Comparison between γ-ray simulations (red curve) and γ-ray excess, normalised to the number of events in the range (0 < ζ < 1). Also shown are the residuals
between the two distributions and the result of a fit of a constant, which is compatible with the residuals within the statistical errors and has a χ²/ndf of 40/49.

Figure 6: Comparison of ζ distributions for γ-ray simulations and γ-ray excess (a): for events with a multiplicity of 2 and (b): for events with reconstructed
energies 0.2 TeV < E < 0.4 TeV. The lower panel again shows their residuals and the result of a fit of a constant. Both fits are compatible with the residuals within the
statistical errors and have a χ²/ndf of 57/49 and 43/49, respectively.

Zenith angle (deg)   ζ cut (std/hard) in the six energy bands, ordered by increasing energy
[...]                [...]       [...]       0.61/0.63   0.56/0.57   0.56/0.57   0.60/0.61
25.0 - 35.0          0.22/0.25   0.51/0.53   0.59/0.60   0.55/0.57   0.48/0.51   0.54/0.56
35.0 - 42.5          -/-         0.45/0.40   0.50/0.60   0.52/0.53   0.44/0.46   0.43/0.45
42.5 - 47.5          -/-         0.25/0.28   [...]       [...]       [...]       [...]
47.5 - 52.5          -/-         -/-         0.47/0.50   0.48/0.51   0.36/0.39   0.38/0.41
52.5 - 60.0          -/-         -/-         0.29/0.32   0.46/0.48   0.38/0.40   0.35/0.37

Table 3: ζ cuts in all zenith angle and energy bands which correspond to an $\epsilon_\gamma(\zeta)$ cut of 0.84 (ζ std-cuts, first value) and 0.83 (ζ hard-cuts, second value).

Figure 7: Comparison between the fit to the energy spectrum of HESS J1745-290
for the July/August 2003 data set (dashed line, [28]) and the spectrum
obtained with the ζ std-cuts (filled circles).

5. Performance and Sensitivity

Having shown the applicability of the BDT classification under
different observational conditions and for the spectral analysis,
the performance and sensitivity of the BDT is studied on the
basis of γ-ray simulations and Off-Events.

5.1. Separation power of ζ cuts

An appropriate parameter to quantify the quality of analysis
cuts is the quality factor Q (e.g. [31]), defined as:

$$Q = \frac{\epsilon_\gamma}{\sqrt{\epsilon_\mathrm{CR}}} \quad \text{with} \quad \epsilon_i = \frac{\hat{N}_i}{N_i} \quad (i = \gamma \text{ or CR}),$$

where the cut efficiency $\epsilon_i$ is defined as the number of events
that pass certain cuts, $\hat{N}_i$, divided by the number of events before
cuts, $N_i$. Fig. 8 shows the development of $Q_\zeta/Q_\mathrm{std}$ as a function
of zenith angle and energy for the ζ std- and hard-cuts and the
Standard Analysis std- and hard-cuts after application of the pre-selection and the
image shape selection. The pre-selection consists of the corresponding
image size cuts and a cut on the distance between the
COG of the shower image and the camera centre to avoid truncated
images at the camera edge. The image shape cuts comprise
cuts on ζ and on MRSW and MRSL, respectively (see Tables
2 and 3 for further information).

The training in energy and zenith angle bands leads to a stable
improvement in separation power for the BDT method and
makes the zenith- and energy-dependent cuts on ζ well suited
for this kind of analysis. Especially at energies below a few
hundred GeV and energies above a few TeV the improvement
in Q for hard- and std-spectrum sources is remarkable. As a
result of the training with γ-rays simulated at a fixed offset of
0.5°, the performance of the ζ cuts is reduced for events with
larger offsets (> 1.5°). However, a training in offset bands resulted
in steps in selection efficiency across the field of view and
in the description of the camera acceptance, and is therefore not
employed.


5.2. Sensitivity of ζ cuts

The optimised ζ cuts (introduced in Section 4.2 and Table
2) are applied to γ-ray simulations and Off-Events, and their
sensitivity for strong, std-spectrum and weak, hard-spectrum
sources was calculated. To disentangle the performance improvement
due to the information stored in the additional parameters
from that due to the treatment of non-linear correlations by
the BDT method, the sensitivity for optimised box cuts on all
training parameters is also shown. These box cuts are a set of
one-dimensional cuts on each training parameter which are all
optimised simultaneously to obtain the best separation between
signal and background.

Fig. 9 illustrates the improved separation power of the ζ analysis
compared to the box cuts applied in the H.E.S.S. Standard
Analysis. Shown is the required observation time for a detection
(signal with more than 5σ above background) of a point
source for a range of fluxes, assuming a power-law in energy
with a spectral index of Γ = 2.63 as measured for the Crab
nebula [24] (Fig. 9 (a)) and for a hard spectrum source with
index Γ = 2.0 (Fig. 9 (b)) for the aforementioned sets of
selection cuts. Remarkably, the optimised ζ cuts show the highest
sensitivity over a wide range of source strengths. The required
observation time for the ζ analysis is up to 45% and 20% less
compared to the H.E.S.S. Standard Analysis, for configuration
Standard and Hard, respectively. It is also clear from Fig. 9
that box cuts add only a little to the total separation gain, since
they ignore non-linear correlations in the six training parameters.

Figure 9: Sensitivity of the H.E.S.S. array for six sets of selection cuts. Shown is the required observation time to detect a point-like γ-ray source with 5σ significance
above the background, as a function of the flux of the source with spectral index (a) Γ = 2.63 and (b) Γ = 2.0 at 20° zenith angle and 0.5° offset. Note that
the required observation time is up to 45% and 20% less for the BDT method compared to the H.E.S.S. Standard Analysis for configuration Standard and Hard,
respectively. Furthermore, no gain in sensitivity is achieved in the case of the six parameter box cuts optimisation for configuration Hard (black curve is hidden
behind the green, dashed curve).
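
A required observation time of the kind shown in Fig. 9 can be estimated from post-cut event rates. The sketch below assumes that the significance is computed with the widely used Li & Ma (1983, Eq. 17) formula and that the background is estimated from an off-region with exposure ratio alpha; neither assumption is stated in this section, and the rates are invented for illustration. Because the expected significance grows with the square root of the observation time, the time needed for 5σ follows from the significance expected in one hour.

```python
import math

def li_ma_significance(n_on, n_off, alpha):
    """Li & Ma (1983), Eq. 17, for on/off counts and exposure ratio alpha."""
    n_tot = n_on + n_off
    term_on = n_on * math.log((1 + alpha) / alpha * n_on / n_tot)
    term_off = n_off * math.log((1 + alpha) * n_off / n_tot)
    return math.sqrt(2.0 * (term_on + term_off))

def hours_for_detection(rate_gamma, rate_bkg, alpha=0.2, sigma=5.0):
    """Hours after which the expected significance reaches `sigma`.

    Rates are post-cut events per hour in the on-region (hypothetical inputs).
    """
    if rate_gamma <= 0 or rate_bkg <= 0:
        raise ValueError("rates must be positive")
    n_on = rate_gamma + rate_bkg      # expected on-counts in one hour
    n_off = rate_bkg / alpha          # expected off-counts in one hour
    s_one_hour = li_ma_significance(n_on, n_off, alpha)
    return (sigma / s_one_hour) ** 2  # significance scales as sqrt(t)

# Invented rates: the second set of cuts keeps fewer gamma-rays but much less background.
t_std = hours_for_detection(rate_gamma=10.0, rate_bkg=5.0)
t_new = hours_for_detection(rate_gamma=9.0, rate_bkg=2.0)
print(f"{t_std:.2f} h vs {t_new:.2f} h")
```

This simple scaling ignores practical requirements such as a minimum number of excess events, so it should be read as an illustration of the trade-off between γ-ray efficiency and background rejection rather than as a reproduction of Fig. 9.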

Since the $X_{max}$ parameter contributes especially at low energies
to the classification (see Fig. 3 for comparison), and cuts
optimised for hard-spectrum sources tend to reject low-energy
events, the performance improvement of the ζ hard-cuts is only
20% compared to the H.E.S.S. hard-cuts. Nevertheless, the
improvement is stable over a wide range of fluxes. One possibility
to further improve the BDT performance for hard-spectrum
sources is to find the best match between size cut applied to
select the training sample (see Section 3.2 for comparison) and
size cut optimised for a given source type in an alternating
process.
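
The quoted reductions in observation time can be related to the quality factor of Section 5.1 with a rough estimate. The derivation below assumes the simple background-dominated approximation $S \approx N_\gamma/\sqrt{N_{CR}}$ for the significance, which holds only for weak sources; the pre-cut rates $R_\gamma$ and $R_{CR}$ are symbols introduced here for illustration and do not appear elsewhere in this work.

```latex
S \approx \frac{N_\gamma}{\sqrt{N_{CR}}}
  = \frac{\epsilon_\gamma R_\gamma t}{\sqrt{\epsilon_{CR} R_{CR} t}}
  = Q \, \frac{R_\gamma}{\sqrt{R_{CR}}} \, \sqrt{t}
\quad\Longrightarrow\quad
t_{5\sigma} \propto Q^{-2},
\qquad
\frac{t_\zeta}{t_{std}} = \left( \frac{Q_{std}}{Q_\zeta} \right)^{2}

\frac{t_\zeta}{t_{std}} = 0.55 \;\Rightarrow\; \frac{Q_\zeta}{Q_{std}} = 0.55^{-1/2} \approx 1.35,
\qquad
\frac{t_\zeta}{t_{std}} = 0.80 \;\Rightarrow\; \frac{Q_\zeta}{Q_{std}} = 0.80^{-1/2} \approx 1.12
```

In this approximation, 45% and 20% less observation time would correspond to quality-factor gains of roughly 35% and 12%. Since the quoted reductions are maximum values and stronger sources are not background-dominated, these numbers are indicative only.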

6. Summary and Outlook 

IACTs have to deal with a vast number of hadronic cosmic-ray
background events. The capability to suppress these against
the γ-rays is one of the aspects which limits the sensitivity of
IACTs. In this work the training, testing and evaluation of the
BDT method with H.E.S.S. data were presented. The BDT is a
multivariate analysis method which combines the information
carried in several classification parameters into one parameter
ζ. This parameter describes the degree to which an event resembles
a shower of hadronic or electromagnetic origin. Observations of the VHE
γ-ray source HESS J1745-290 show a very good agreement between
the ζ distribution of the measured γ-ray excess and the
predictions from γ-ray simulations for a variety of observational
conditions. Zenith- and energy-dependent cuts are introduced
to account for the zenith- and energy-dependent classification
of the BDT. Performance tests have shown a clearly increased
separation power for the ζ analysis compared to the
H.E.S.S. Standard Analysis, with up to 45% less observation time
required for sources with a spectral index compatible with that
measured for the Crab nebula.

The systematic studies performed in this work and the
achieved classification power demonstrate that a multivariate
analysis approach like BDT is well suited for the analysis of
γ-ray data measured with instruments like H.E.S.S. In near-
and mid-term projects like H.E.S.S. II, MAGIC II, CTA and
AGIS the accessible energy range of IACTs will be extended and the
reachable sensitivity will increase. Multivariate methods can play
a major role for the analysis and particularly for the γ/hadron
separation of upcoming instruments. The majority of the events
will be recorded below 100 GeV, where γ/hadron separation is
increasingly difficult. In this work, parameters such as $X_{max}$
have proven particularly useful for the separation at low energies.